We appreciate all reviewers for their correct summarisation of our contributions as (1) introducing the ice-start problem

Neural Information Processing Systems

We thank all the reviewers for taking the time to provide comments, and reviewers 1 and 4 for their positive feedback. In the following, we address the questions raised by each reviewer. This demonstrates the broad applicability of the proposed method. For AUIC, we used its full name in line 205. However, we would like to highlight our novel inference method for BELGAM. We will add a detailed introduction of this encoder in the appendix of the revised version.


Federated Learning with Discriminative Naive Bayes Classifier

Torrijos, Pablo, Alfaro, Juan C., Gámez, José A., Puerta, José M.

arXiv.org Artificial Intelligence

Federated Learning has emerged as a promising approach to train machine learning models on decentralized data sources while preserving data privacy. This paper proposes a new federated approach for Naive Bayes (NB) classification, assuming discrete variables. Our approach federates a discriminative variant of NB, sharing model parameters that, unlike conditional probability tables, carry no direct meaning about the underlying data distribution. This makes the process more robust to possible attacks. We conduct extensive experiments on 12 datasets to validate the efficacy of our approach, comparing federated and non-federated settings. Additionally, we benchmark our method against the generative variant of NB, which serves as a baseline for comparison. Our experimental results demonstrate the effectiveness of our method in achieving accurate classification.
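The parameter-sharing scheme can be sketched as follows, assuming each client fits a simple discriminative (softmax-style) model over one-hot encoded discrete features and the server averages the raw weights; the function names and training loop are illustrative, not taken from the paper.

```python
import numpy as np

def local_train(X, y, n_classes, lr=0.1, epochs=50):
    """Fit a simple discriminative (softmax) model on one client's
    one-hot encoded discrete features; only raw weights leave the client."""
    n_feat = X.shape[1]
    W = np.zeros((n_feat, n_classes))
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        onehot = np.eye(n_classes)[y]
        W += lr * X.T @ (onehot - p) / len(y)        # gradient ascent step
    return W

def federated_round(client_data, n_classes):
    """Server averages the clients' discriminative weights; unlike
    conditional probability tables, these weights do not directly
    expose each client's data distribution."""
    weights = [local_train(X, y, n_classes) for X, y in client_data]
    return np.mean(weights, axis=0)
```

The averaged weight matrix can then be used directly for prediction via `argmax(X @ W)` on any client.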


Sampling-Free Probabilistic Deep State-Space Models

Look, Andreas, Kandemir, Melih, Rakitsch, Barbara, Peters, Jan

arXiv.org Machine Learning

Many real-world dynamical systems can be described as State-Space Models (SSMs). In this formulation, each observation is emitted by a latent state, which follows first-order Markovian dynamics. A Probabilistic Deep SSM (ProDSSM) generalizes this framework to dynamical systems of unknown parametric form, where the transition and emission models are described by neural networks with uncertain weights. In this work, we propose the first deterministic inference algorithm for models of this type. Our framework allows efficient approximations for training and testing. We demonstrate in our experiments that our new method can be employed for a variety of tasks and enjoys a superior balance between predictive performance and computational budget.
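The sampling-free idea can be illustrated with a first-order moment-matching prediction step, assuming a toy transition x' = tanh(Ax) + w with known A; the paper's full scheme also propagates uncertainty over the network weights, which this sketch omits.

```python
import numpy as np

def propagate_moments(mu, Sigma, A, Q):
    """One sampling-free prediction step: push the state mean and
    covariance through x' = tanh(A x) + w, w ~ N(0, Q), using a
    first-order (linearization) approximation instead of samples."""
    z = A @ mu
    mu_next = np.tanh(z)
    J = np.diag(1 - np.tanh(z) ** 2) @ A   # Jacobian of tanh(A x) at mu
    Sigma_next = J @ Sigma @ J.T + Q
    return mu_next, Sigma_next
```

Iterating this step gives deterministic predictive moments for the whole trajectory, at the cost of one Jacobian evaluation per step rather than many Monte Carlo rollouts.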


When a CBR in Hand is Better than Twins in the Bush

Ahmed, Mobyen Uddin, Barua, Shaibal, Begum, Shahina, Islam, Mir Riyanul, Weber, Rosina O

arXiv.org Artificial Intelligence

AI methods referred to as interpretable are often discredited as inaccurate by supporters of a trade-off between interpretability and accuracy. In many problem contexts, however, this trade-off does not hold. This paper discusses a regression problem context, predicting flight take-off delays, where the most accurate data regression model was trained via the XGBoost implementation of gradient boosted decision trees. After building an XGB-CBR Twin and converting the XGBoost feature importances into global weights in the CBR model, the resultant CBR model alone provides the most accurate local predictions, maintains the global importances to provide a global explanation of the model, and offers the most interpretable representation for local explanations. This resultant CBR model becomes a benchmark of accuracy and interpretability for this problem context, and hence it is used to evaluate two additive feature attribution methods, SHAP and LIME, to explain the XGBoost regression model.
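A minimal sketch of the retrieval step, assuming the CBR model retrieves the nearest cases under a distance weighted by the XGBoost-derived global feature importances and averages their targets; the helper name and k-NN aggregation are illustrative assumptions.

```python
import numpy as np

def cbr_predict(query, case_base_X, case_base_y, global_weights, k=3):
    """Local prediction from a CBR model: retrieve the k nearest cases
    under a similarity metric weighted by global feature importances
    (e.g. taken from XGBoost), then average their targets."""
    diffs = case_base_X - query                        # (n_cases, n_feat)
    dists = np.sqrt(((diffs ** 2) * global_weights).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return case_base_y[nearest].mean(), nearest
```

The retrieved cases themselves serve as the local explanation: the prediction is justified by concrete, inspectable precedents rather than by attribution scores.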


A Feedback Integrated Web-Based Multi-Criteria Group Decision Support Model for Contractor Selection using Fuzzy Analytic Hierarchy Process

Afolayan, Abimbola Helen, Ojokoh, Bolanle Adefowoke, Adetunmbi, Adebayo

arXiv.org Artificial Intelligence

The construction sector constitutes one of the most important sectors in the economy of any country. Many construction projects experience time and cost overruns due to the wrong choice of contractors. In this paper, a feedback-integrated multi-criteria group decision support model for contractor selection is proposed. The proposed model consists of two modules: a technical evaluation module and a financial evaluation module. The technical evaluation module screens the contractors down to a smaller set of acceptable ones, and its functionality is based on the Fuzzy Analytic Hierarchy Process (FAHP).
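One way the FAHP weighting step can be sketched, assuming Buckley's fuzzy geometric mean over triangular fuzzy comparison judgments; the paper may use a different FAHP variant (e.g. extent analysis), so this is only an illustrative computation.

```python
import numpy as np

def fahp_weights(M):
    """Criteria weights via Buckley's fuzzy geometric mean.
    M[i][j] = (l, m, u), a triangular fuzzy comparison of criterion i vs j."""
    M = np.asarray(M, dtype=float)                  # shape (n, n, 3)
    r = np.prod(M, axis=1) ** (1.0 / M.shape[0])    # fuzzy geometric mean per row
    total = r.sum(axis=0)                           # fuzzy sum over criteria
    # fuzzy division: (l, m, u) / (L, M, U) = (l/U, m/M, u/L)
    w = np.stack([r[:, 0] / total[2],
                  r[:, 1] / total[1],
                  r[:, 2] / total[0]], axis=1)
    crisp = w.mean(axis=1)                          # defuzzify (centroid)
    return crisp / crisp.sum()                      # normalize to sum to 1
```

With all judgments equal to (1, 1, 1) the criteria come out equally weighted, which is a quick sanity check on the implementation.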


Dynamic Stale Synchronous Parallel Distributed Training for Deep Learning

Zhao, Xing, An, Aijun, Liu, Junfeng, Chen, Bao Xin

arXiv.org Machine Learning

Deep learning is a popular machine learning technique and has been applied to many real-world problems. However, training a deep neural network is very time-consuming, especially on big data. It has become difficult for a single machine to train a large model over large datasets. A popular solution is to distribute and parallelize the training process across multiple machines using the parameter server framework. In this paper, we present a distributed paradigm on the parameter server framework called Dynamic Stale Synchronous Parallel (DSSP), which improves the state-of-the-art Stale Synchronous Parallel (SSP) paradigm by dynamically determining the staleness threshold at run time. Conventionally, to run distributed training in SSP, the user needs to specify a particular staleness threshold as a hyper-parameter. However, a user does not usually know how to set the threshold and thus often finds a value through trial and error, which is time-consuming. Based on workers' recent processing times, DSSP adaptively adjusts the threshold per iteration at run time to reduce the time faster workers spend waiting for synchronization of the globally shared parameters, and consequently increases the frequency of parameter updates (i.e., iteration throughput), which speeds up convergence. We compare DSSP with other paradigms such as Bulk Synchronous Parallel (BSP), Asynchronous Parallel (ASP), and SSP by running deep neural network (DNN) models over GPU clusters in both homogeneous and heterogeneous environments. The results show that in a heterogeneous environment, where the cluster consists of mixed models of GPUs, DSSP converges to a higher accuracy much earlier than SSP and BSP and performs similarly to ASP. In a homogeneous distributed cluster, DSSP has more stable and slightly better performance than SSP and ASP, and converges much faster than BSP.
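The dynamic-threshold idea can be sketched as follows; the decision rule below (scaling a minimum threshold by the slow/fast worker ratio) is an illustrative heuristic, not the exact DSSP algorithm.

```python
def dynamic_threshold(recent_times, s_min=2, s_max=16):
    """Pick a staleness threshold per iteration from recent per-worker
    processing times: the larger the gap between the slowest and the
    fastest worker, the more staleness is allowed, so fast workers
    spend less time waiting at the synchronization barrier."""
    fastest, slowest = min(recent_times), max(recent_times)
    ratio = slowest / fastest              # >= 1; 1 means homogeneous workers
    s = round(s_min * ratio)
    return max(s_min, min(s_max, s))       # clamp to the allowed range
```

In a homogeneous cluster this collapses to the minimum threshold (SSP-like behavior), while in a heterogeneous cluster it drifts toward the cap, trading bounded staleness for higher iteration throughput.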


A Bi-layered Parallel Training Architecture for Large-scale Convolutional Neural Networks

Chen, Jianguo, Li, Kenli, Bilal, Kashif, Zhou, Xu, Li, Keqin, Yu, Philip S.

arXiv.org Machine Learning

Benefiting from large-scale training datasets and complex training networks, Convolutional Neural Networks (CNNs) are widely applied in various fields with high accuracy. However, the training process of CNNs is very time-consuming, where large amounts of training samples and iterative operations are required to obtain high-quality weight parameters. In this paper, we focus on the time-consuming training process of large-scale CNNs and propose a Bi-layered Parallel Training (BPT-CNN) architecture for distributed computing environments. BPT-CNN consists of two main components: (a) an outer-layer parallel training for multiple CNN subnetworks on separate data subsets, and (b) an inner-layer parallel training for each subnetwork. In the outer-layer parallelism, we address critical issues of distributed and parallel computing, including data communication, synchronization, and workload balance. A heterogeneous-aware Incremental Data Partitioning and Allocation (IDPA) strategy is proposed, where large-scale training datasets are partitioned and allocated to the computing nodes in batches according to their computing power. To minimize synchronization waiting during the global weight update process, an Asynchronous Global Weight Update (AGWU) strategy is proposed. In the inner-layer parallelism, we further accelerate the training process for each CNN subnetwork on each computer, where the computation steps of the convolutional layer and the local weight training are parallelized based on task parallelism. We introduce task decomposition and scheduling strategies with the objectives of thread-level load balancing and minimum waiting time for critical paths. Extensive experimental results indicate that the proposed BPT-CNN effectively improves the training performance of CNNs while maintaining accuracy. Index Terms: Big data, bi-layered parallel computing, convolutional neural networks, deep learning, distributed computing.
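The heterogeneity-aware allocation step of IDPA can be sketched as a proportional split, assuming each node's computing power is summarized as a single number; the remainder-handling policy here is an illustrative assumption.

```python
def partition_by_power(n_samples, powers):
    """Allocate a batch of training samples to nodes in proportion to
    their measured computing power, so faster nodes receive more data
    and all nodes finish their local epoch at roughly the same time."""
    total = sum(powers)
    shares = [int(n_samples * p / total) for p in powers]  # floor shares
    leftover = n_samples - sum(shares)
    # hand leftover samples to nodes in decreasing order of power
    order = sorted(range(len(powers)), key=lambda i: -powers[i])
    for i in order[:leftover]:
        shares[i] += 1
    return shares
```

Allocating each incoming batch this way (rather than splitting the full dataset once) lets the split track nodes whose effective power changes over time.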